{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "4d5a3ead",
   "metadata": {
    "origin_pos": 0
   },
   "source": [
    "# Convexity\n",
    ":label:`sec_convexity`\n",
    "\n",
    "Convexity plays a vital role in the design of optimization algorithms. \n",
    "This is largely due to the fact that it is much easier to analyze and test algorithms in such a context. \n",
    "In other words,\n",
    "if the algorithm performs poorly even in the convex setting,\n",
    "typically we should not hope to see great results otherwise. \n",
    "Furthermore, even though the optimization problems in deep learning are generally nonconvex, they often exhibit some properties of convex ones near local minima. This can lead to exciting new optimization variants such as :cite:`Izmailov.Podoprikhin.Garipov.ea.2018`.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "c12196ca",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2023-08-18T19:36:33.890461Z",
     "iopub.status.busy": "2023-08-18T19:36:33.890146Z",
     "iopub.status.idle": "2023-08-18T19:36:37.257769Z",
     "shell.execute_reply": "2023-08-18T19:36:37.256833Z"
    },
    "origin_pos": 2,
    "tab": [
     "pytorch"
    ]
   },
   "outputs": [],
   "source": [
    "%matplotlib inline\n",
    "import numpy as np\n",
    "import torch\n",
    "from mpl_toolkits import mplot3d\n",
    "from d2l import torch as d2l"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fc2d20f3",
   "metadata": {
    "origin_pos": 4
   },
   "source": [
    "## Definitions\n",
    "\n",
    "Before convex analysis,\n",
    "we need to define *convex sets* and *convex functions*.\n",
    "They lead to mathematical tools that are commonly applied to machine learning.\n",
    "\n",
    "\n",
    "### Convex Sets\n",
    "\n",
    "Sets are the basis of convexity. Simply put, a set $\\mathcal{X}$ in a vector space is *convex* if for any $a, b \\in \\mathcal{X}$ the line segment connecting $a$ and $b$ is also in $\\mathcal{X}$. In mathematical terms this means that for all $\\lambda \\in [0, 1]$ we have\n",
    "\n",
    "$$\\lambda  a + (1-\\lambda)  b \\in \\mathcal{X} \\textrm{ whenever } a, b \\in \\mathcal{X}.$$\n",
    "\n",
    "This sounds a bit abstract. Consider :numref:`fig_pacman`. The first set is not convex since there exist line segments that are not contained in it.\n",
    "The other two sets suffer no such problem.\n",
    "\n",
    "![The first set is nonconvex and the other two are convex.](../img/pacman.svg)\n",
    ":label:`fig_pacman`\n",
    "\n",
    "Definitions on their own are not particularly useful unless you can do something with them.\n",
    "In this case we can look at intersections as shown in :numref:`fig_convex_intersect`.\n",
    "Assume that $\\mathcal{X}$ and $\\mathcal{Y}$ are convex sets. Then $\\mathcal{X} \\cap \\mathcal{Y}$ is also convex. To see this, consider any $a, b \\in \\mathcal{X} \\cap \\mathcal{Y}$. Since $\\mathcal{X}$ and $\\mathcal{Y}$ are convex, the line segments connecting $a$ and $b$ are contained in both $\\mathcal{X}$ and $\\mathcal{Y}$. Given that, they also need to be contained in $\\mathcal{X} \\cap \\mathcal{Y}$, thus proving our theorem.\n",
    "\n",
    "![The intersection between two convex sets is convex.](../img/convex-intersect.svg)\n",
    ":label:`fig_convex_intersect`\n",
    "\n",
    "We can strengthen this result with little effort: given convex sets $\\mathcal{X}_i$, their intersection $\\cap_{i} \\mathcal{X}_i$ is convex.\n",
    "To see that the converse is not true, consider two disjoint sets $\\mathcal{X} \\cap \\mathcal{Y} = \\emptyset$. Now pick $a \\in \\mathcal{X}$ and $b \\in \\mathcal{Y}$. The line segment in :numref:`fig_nonconvex` connecting $a$ and $b$ needs to contain some part that is neither in $\\mathcal{X}$ nor in $\\mathcal{Y}$, since we assumed that $\\mathcal{X} \\cap \\mathcal{Y} = \\emptyset$. Hence the line segment is not in $\\mathcal{X} \\cup \\mathcal{Y}$ either, thus proving that in general unions of convex sets need not be convex.\n",
    "\n",
    "![The union of two convex sets need not be convex.](../img/nonconvex.svg)\n",
    ":label:`fig_nonconvex`\n",
    "\n",
    "Typically the problems in deep learning are defined on convex sets. For instance, $\\mathbb{R}^d$,\n",
    "the set of $d$-dimensional vectors of real numbers,\n",
    "is a convex set (after all, the line between any two points in $\\mathbb{R}^d$ remains in $\\mathbb{R}^d$). In some cases we work with variables of bounded length, such as balls of radius $r$ as defined by $\\{\\mathbf{x} | \\mathbf{x} \\in \\mathbb{R}^d \\textrm{ and } \\|\\mathbf{x}\\| \\leq r\\}$.\n",
    "\n",
    "### Convex Functions\n",
    "\n",
    "Now that we have convex sets we can introduce *convex functions* $f$.\n",
    "Given a convex set $\\mathcal{X}$, a function $f: \\mathcal{X} \\to \\mathbb{R}$ is *convex* if for all $x, x' \\in \\mathcal{X}$ and for all $\\lambda \\in [0, 1]$ we have\n",
    "\n",
    "$$\\lambda f(x) + (1-\\lambda) f(x') \\geq f(\\lambda x + (1-\\lambda) x').$$\n",
    "\n",
    "To illustrate this let's plot a few functions and check which ones satisfy the requirement.\n",
    "Below we define a few functions, both convex and nonconvex.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "d9822f32",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2023-08-18T19:36:37.262247Z",
     "iopub.status.busy": "2023-08-18T19:36:37.261719Z",
     "iopub.status.idle": "2023-08-18T19:36:38.018137Z",
     "shell.execute_reply": "2023-08-18T19:36:38.017320Z"
    },
    "origin_pos": 5,
    "tab": [
     "pytorch"
    ]
   },
   "outputs": [
    {
     "data": {
      "image/svg+xml": [
       "<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"no\"?>\n",
       "<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
       "  \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
       "<svg xmlns:xlink=\"http://www.w3.org/1999/xlink\" width=\"539.503125pt\" height=\"197.398125pt\" viewBox=\"0 0 539.503125 197.398125\" xmlns=\"http://www.w3.org/2000/svg\" version=\"1.1\">\n",
       " <metadata>\n",
       "  <rdf:RDF xmlns:dc=\"http://purl.org/dc/elements/1.1/\" xmlns:cc=\"http://creativecommons.org/ns#\" xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">\n",
       "   <cc:Work>\n",
       "    <dc:type rdf:resource=\"http://purl.org/dc/dcmitype/StillImage\"/>\n",
       "    <dc:date>2023-08-18T19:36:37.931986</dc:date>\n",
       "    <dc:format>image/svg+xml</dc:format>\n",
       "    <dc:creator>\n",
       "     <cc:Agent>\n",
       "      <dc:title>Matplotlib v3.7.2, https://matplotlib.org/</dc:title>\n",
       "     </cc:Agent>\n",
       "    </dc:creator>\n",
       "   </cc:Work>\n",
       "  </rdf:RDF>\n",
       " </metadata>\n",
       " <defs>\n",
       "  <style type=\"text/css\">*{stroke-linejoin: round; stroke-linecap: butt}</style>\n",
       " </defs>\n",
       " <g id=\"figure_1\">\n",
       "  <g id=\"patch_1\">\n",
       "   <path d=\"M 0 197.398125 \n",
       "L 539.503125 197.398125 \n",
       "L 539.503125 0 \n",
       "L 0 0 \n",
       "z\n",
       "\" style=\"fill: #ffffff\"/>\n",
       "  </g>\n",
       "  <g id=\"axes_1\">\n",
       "   <g id=\"patch_2\">\n",
       "    <path d=\"M 30.103125 173.52 \n",
       "L 177.809007 173.52 \n",
       "L 177.809007 7.2 \n",
       "L 30.103125 7.2 \n",
       "z\n",
       "\" style=\"fill: #ffffff\"/>\n",
       "   </g>\n",
       "   <g id=\"matplotlib.axis_1\">\n",
       "    <g id=\"xtick_1\">\n",
       "     <g id=\"line2d_1\">\n",
       "      <path d=\"M 36.817029 173.52 \n",
       "L 36.817029 7.2 \n",
       "\" clip-path=\"url(#p4d56cb7f55)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_2\">\n",
       "      <defs>\n",
       "       <path id=\"m1543edb625\" d=\"M 0 0 \n",
       "L 0 3.5 \n",
       "\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </defs>\n",
       "      <g>\n",
       "       <use xlink:href=\"#m1543edb625\" x=\"36.817029\" y=\"173.52\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_1\">\n",
       "      <!-- −2 -->\n",
       "      <g transform=\"translate(29.445935 188.118438) scale(0.1 -0.1)\">\n",
       "       <defs>\n",
       "        <path id=\"DejaVuSans-2212\" d=\"M 678 2272 \n",
       "L 4684 2272 \n",
       "L 4684 1741 \n",
       "L 678 1741 \n",
       "L 678 2272 \n",
       "z\n",
       "\" transform=\"scale(0.015625)\"/>\n",
       "        <path id=\"DejaVuSans-32\" d=\"M 1228 531 \n",
       "L 3431 531 \n",
       "L 3431 0 \n",
       "L 469 0 \n",
       "L 469 531 \n",
       "Q 828 903 1448 1529 \n",
       "Q 2069 2156 2228 2338 \n",
       "Q 2531 2678 2651 2914 \n",
       "Q 2772 3150 2772 3378 \n",
       "Q 2772 3750 2511 3984 \n",
       "Q 2250 4219 1831 4219 \n",
       "Q 1534 4219 1204 4116 \n",
       "Q 875 4013 500 3803 \n",
       "L 500 4441 \n",
       "Q 881 4594 1212 4672 \n",
       "Q 1544 4750 1819 4750 \n",
       "Q 2544 4750 2975 4387 \n",
       "Q 3406 4025 3406 3419 \n",
       "Q 3406 3131 3298 2873 \n",
       "Q 3191 2616 2906 2266 \n",
       "Q 2828 2175 2409 1742 \n",
       "Q 1991 1309 1228 531 \n",
       "z\n",
       "\" transform=\"scale(0.015625)\"/>\n",
       "       </defs>\n",
       "       <use xlink:href=\"#DejaVuSans-2212\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-32\" x=\"83.789062\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"xtick_2\">\n",
       "     <g id=\"line2d_3\">\n",
       "      <path d=\"M 104.124334 173.52 \n",
       "L 104.124334 7.2 \n",
       "\" clip-path=\"url(#p4d56cb7f55)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_4\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#m1543edb625\" x=\"104.124334\" y=\"173.52\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_2\">\n",
       "      <!-- 0 -->\n",
       "      <g transform=\"translate(100.943084 188.118438) scale(0.1 -0.1)\">\n",
       "       <defs>\n",
       "        <path id=\"DejaVuSans-30\" d=\"M 2034 4250 \n",
       "Q 1547 4250 1301 3770 \n",
       "Q 1056 3291 1056 2328 \n",
       "Q 1056 1369 1301 889 \n",
       "Q 1547 409 2034 409 \n",
       "Q 2525 409 2770 889 \n",
       "Q 3016 1369 3016 2328 \n",
       "Q 3016 3291 2770 3770 \n",
       "Q 2525 4250 2034 4250 \n",
       "z\n",
       "M 2034 4750 \n",
       "Q 2819 4750 3233 4129 \n",
       "Q 3647 3509 3647 2328 \n",
       "Q 3647 1150 3233 529 \n",
       "Q 2819 -91 2034 -91 \n",
       "Q 1250 -91 836 529 \n",
       "Q 422 1150 422 2328 \n",
       "Q 422 3509 836 4129 \n",
       "Q 1250 4750 2034 4750 \n",
       "z\n",
       "\" transform=\"scale(0.015625)\"/>\n",
       "       </defs>\n",
       "       <use xlink:href=\"#DejaVuSans-30\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"xtick_3\">\n",
       "     <g id=\"line2d_5\">\n",
       "      <path d=\"M 171.43164 173.52 \n",
       "L 171.43164 7.2 \n",
       "\" clip-path=\"url(#p4d56cb7f55)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_6\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#m1543edb625\" x=\"171.43164\" y=\"173.52\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_3\">\n",
       "      <!-- 2 -->\n",
       "      <g transform=\"translate(168.25039 188.118438) scale(0.1 -0.1)\">\n",
       "       <use xlink:href=\"#DejaVuSans-32\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "   </g>\n",
       "   <g id=\"matplotlib.axis_2\">\n",
       "    <g id=\"ytick_1\">\n",
       "     <g id=\"line2d_7\">\n",
       "      <path d=\"M 30.103125 165.96 \n",
       "L 177.809007 165.96 \n",
       "\" clip-path=\"url(#p4d56cb7f55)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_8\">\n",
       "      <defs>\n",
       "       <path id=\"mf84f3d80eb\" d=\"M 0 0 \n",
       "L -3.5 0 \n",
       "\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </defs>\n",
       "      <g>\n",
       "       <use xlink:href=\"#mf84f3d80eb\" x=\"30.103125\" y=\"165.96\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_4\">\n",
       "      <!-- 0.0 -->\n",
       "      <g transform=\"translate(7.2 169.759219) scale(0.1 -0.1)\">\n",
       "       <defs>\n",
       "        <path id=\"DejaVuSans-2e\" d=\"M 684 794 \n",
       "L 1344 794 \n",
       "L 1344 0 \n",
       "L 684 0 \n",
       "L 684 794 \n",
       "z\n",
       "\" transform=\"scale(0.015625)\"/>\n",
       "       </defs>\n",
       "       <use xlink:href=\"#DejaVuSans-30\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-30\" x=\"95.410156\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"ytick_2\">\n",
       "     <g id=\"line2d_9\">\n",
       "      <path d=\"M 30.103125 128.16 \n",
       "L 177.809007 128.16 \n",
       "\" clip-path=\"url(#p4d56cb7f55)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_10\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#mf84f3d80eb\" x=\"30.103125\" y=\"128.16\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_5\">\n",
       "      <!-- 0.5 -->\n",
       "      <g transform=\"translate(7.2 131.959219) scale(0.1 -0.1)\">\n",
       "       <defs>\n",
       "        <path id=\"DejaVuSans-35\" d=\"M 691 4666 \n",
       "L 3169 4666 \n",
       "L 3169 4134 \n",
       "L 1269 4134 \n",
       "L 1269 2991 \n",
       "Q 1406 3038 1543 3061 \n",
       "Q 1681 3084 1819 3084 \n",
       "Q 2600 3084 3056 2656 \n",
       "Q 3513 2228 3513 1497 \n",
       "Q 3513 744 3044 326 \n",
       "Q 2575 -91 1722 -91 \n",
       "Q 1428 -91 1123 -41 \n",
       "Q 819 9 494 109 \n",
       "L 494 744 \n",
       "Q 775 591 1075 516 \n",
       "Q 1375 441 1709 441 \n",
       "Q 2250 441 2565 725 \n",
       "Q 2881 1009 2881 1497 \n",
       "Q 2881 1984 2565 2268 \n",
       "Q 2250 2553 1709 2553 \n",
       "Q 1456 2553 1204 2497 \n",
       "Q 953 2441 691 2322 \n",
       "L 691 4666 \n",
       "z\n",
       "\" transform=\"scale(0.015625)\"/>\n",
       "       </defs>\n",
       "       <use xlink:href=\"#DejaVuSans-30\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-35\" x=\"95.410156\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"ytick_3\">\n",
       "     <g id=\"line2d_11\">\n",
       "      <path d=\"M 30.103125 90.36 \n",
       "L 177.809007 90.36 \n",
       "\" clip-path=\"url(#p4d56cb7f55)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_12\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#mf84f3d80eb\" x=\"30.103125\" y=\"90.36\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_6\">\n",
       "      <!-- 1.0 -->\n",
       "      <g transform=\"translate(7.2 94.159219) scale(0.1 -0.1)\">\n",
       "       <defs>\n",
       "        <path id=\"DejaVuSans-31\" d=\"M 794 531 \n",
       "L 1825 531 \n",
       "L 1825 4091 \n",
       "L 703 3866 \n",
       "L 703 4441 \n",
       "L 1819 4666 \n",
       "L 2450 4666 \n",
       "L 2450 531 \n",
       "L 3481 531 \n",
       "L 3481 0 \n",
       "L 794 0 \n",
       "L 794 531 \n",
       "z\n",
       "\" transform=\"scale(0.015625)\"/>\n",
       "       </defs>\n",
       "       <use xlink:href=\"#DejaVuSans-31\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-30\" x=\"95.410156\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"ytick_4\">\n",
       "     <g id=\"line2d_13\">\n",
       "      <path d=\"M 30.103125 52.56 \n",
       "L 177.809007 52.56 \n",
       "\" clip-path=\"url(#p4d56cb7f55)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_14\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#mf84f3d80eb\" x=\"30.103125\" y=\"52.56\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_7\">\n",
       "      <!-- 1.5 -->\n",
       "      <g transform=\"translate(7.2 56.359219) scale(0.1 -0.1)\">\n",
       "       <use xlink:href=\"#DejaVuSans-31\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-35\" x=\"95.410156\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"ytick_5\">\n",
       "     <g id=\"line2d_15\">\n",
       "      <path d=\"M 30.103125 14.76 \n",
       "L 177.809007 14.76 \n",
       "\" clip-path=\"url(#p4d56cb7f55)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_16\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#mf84f3d80eb\" x=\"30.103125\" y=\"14.76\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_8\">\n",
       "      <!-- 2.0 -->\n",
       "      <g transform=\"translate(7.2 18.559219) scale(0.1 -0.1)\">\n",
       "       <use xlink:href=\"#DejaVuSans-32\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-30\" x=\"95.410156\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "   </g>\n",
       "   <g id=\"line2d_17\">\n",
       "    <path d=\"M 36.817029 14.76 \n",
       "L 40.855467 32.359681 \n",
       "L 44.55737 47.53638 \n",
       "L 48.259272 61.798321 \n",
       "L 51.96117 75.145494 \n",
       "L 55.32654 86.48551 \n",
       "L 58.691902 97.069496 \n",
       "L 62.057268 106.8975 \n",
       "L 65.086094 115.096311 \n",
       "L 68.114924 122.682777 \n",
       "L 71.143756 129.656883 \n",
       "L 73.836048 135.342002 \n",
       "L 76.528339 140.54328 \n",
       "L 79.220631 145.260719 \n",
       "L 81.912923 149.494318 \n",
       "L 84.26868 152.801821 \n",
       "L 86.624435 155.73888 \n",
       "L 88.980191 158.305501 \n",
       "L 90.999409 160.21062 \n",
       "L 93.018628 161.84358 \n",
       "L 95.037849 163.20438 \n",
       "L 97.057067 164.29302 \n",
       "L 99.076287 165.1095 \n",
       "L 101.095506 165.65382 \n",
       "L 102.778188 165.89952 \n",
       "L 104.460871 165.95622 \n",
       "L 106.143553 165.82392 \n",
       "L 107.826236 165.50262 \n",
       "L 109.845455 164.86758 \n",
       "L 111.864674 163.96038 \n",
       "L 113.883893 162.78102 \n",
       "L 115.903113 161.3295 \n",
       "L 117.922332 159.60582 \n",
       "L 119.941551 157.60998 \n",
       "L 122.297305 154.937522 \n",
       "L 124.653063 151.894619 \n",
       "L 127.008818 148.481279 \n",
       "L 129.364574 144.6975 \n",
       "L 132.056866 139.91958 \n",
       "L 134.749157 134.657821 \n",
       "L 137.441449 128.912223 \n",
       "L 140.13374 122.682786 \n",
       "L 143.16257 115.096324 \n",
       "L 146.1914 106.8975 \n",
       "L 149.220226 98.086327 \n",
       "L 152.585596 87.577918 \n",
       "L 155.950958 76.313523 \n",
       "L 159.316324 64.29312 \n",
       "L 163.018227 50.1975 \n",
       "L 166.720129 35.187121 \n",
       "L 170.422027 19.261993 \n",
       "L 171.095104 16.268222 \n",
       "L 171.095104 16.268222 \n",
       "\" clip-path=\"url(#p4d56cb7f55)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5; stroke-linecap: square\"/>\n",
       "   </g>\n",
       "   <g id=\"line2d_18\">\n",
       "    <path d=\"M 53.643855 80.91 \n",
       "L 137.777987 128.16 \n",
       "\" clip-path=\"url(#p4d56cb7f55)\" style=\"fill: none; stroke-dasharray: 5.55,2.4; stroke-dashoffset: 0; stroke: #bf00bf; stroke-width: 1.5\"/>\n",
       "   </g>\n",
       "   <g id=\"patch_3\">\n",
       "    <path d=\"M 30.103125 173.52 \n",
       "L 30.103125 7.2 \n",
       "\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
       "   </g>\n",
       "   <g id=\"patch_4\">\n",
       "    <path d=\"M 177.809007 173.52 \n",
       "L 177.809007 7.2 \n",
       "\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
       "   </g>\n",
       "   <g id=\"patch_5\">\n",
       "    <path d=\"M 30.103125 173.52 \n",
       "L 177.809007 173.52 \n",
       "\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
       "   </g>\n",
       "   <g id=\"patch_6\">\n",
       "    <path d=\"M 30.103125 7.2 \n",
       "L 177.809007 7.2 \n",
       "\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
       "   </g>\n",
       "  </g>\n",
       "  <g id=\"axes_2\">\n",
       "   <g id=\"patch_7\">\n",
       "    <path d=\"M 207.350184 173.52 \n",
       "L 355.056066 173.52 \n",
       "L 355.056066 7.2 \n",
       "L 207.350184 7.2 \n",
       "z\n",
       "\" style=\"fill: #ffffff\"/>\n",
       "   </g>\n",
       "   <g id=\"matplotlib.axis_3\">\n",
       "    <g id=\"xtick_4\">\n",
       "     <g id=\"line2d_19\">\n",
       "      <path d=\"M 214.064088 173.52 \n",
       "L 214.064088 7.2 \n",
       "\" clip-path=\"url(#p3ff32109c2)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_20\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#m1543edb625\" x=\"214.064088\" y=\"173.52\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_9\">\n",
       "      <!-- −2 -->\n",
       "      <g transform=\"translate(206.692994 188.118438) scale(0.1 -0.1)\">\n",
       "       <use xlink:href=\"#DejaVuSans-2212\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-32\" x=\"83.789062\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"xtick_5\">\n",
       "     <g id=\"line2d_21\">\n",
       "      <path d=\"M 281.371393 173.52 \n",
       "L 281.371393 7.2 \n",
       "\" clip-path=\"url(#p3ff32109c2)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_22\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#m1543edb625\" x=\"281.371393\" y=\"173.52\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_10\">\n",
       "      <!-- 0 -->\n",
       "      <g transform=\"translate(278.190143 188.118438) scale(0.1 -0.1)\">\n",
       "       <use xlink:href=\"#DejaVuSans-30\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"xtick_6\">\n",
       "     <g id=\"line2d_23\">\n",
       "      <path d=\"M 348.678699 173.52 \n",
       "L 348.678699 7.2 \n",
       "\" clip-path=\"url(#p3ff32109c2)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_24\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#m1543edb625\" x=\"348.678699\" y=\"173.52\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_11\">\n",
       "      <!-- 2 -->\n",
       "      <g transform=\"translate(345.497449 188.118438) scale(0.1 -0.1)\">\n",
       "       <use xlink:href=\"#DejaVuSans-32\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "   </g>\n",
       "   <g id=\"matplotlib.axis_4\">\n",
       "    <g id=\"ytick_6\">\n",
       "     <g id=\"line2d_25\">\n",
       "      <path d=\"M 207.350184 165.96 \n",
       "L 355.056066 165.96 \n",
       "\" clip-path=\"url(#p3ff32109c2)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_26\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#mf84f3d80eb\" x=\"207.350184\" y=\"165.96\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_12\">\n",
       "      <!-- −1.0 -->\n",
       "      <g transform=\"translate(176.067371 169.759219) scale(0.1 -0.1)\">\n",
       "       <use xlink:href=\"#DejaVuSans-2212\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-31\" x=\"83.789062\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-2e\" x=\"147.412109\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-30\" x=\"179.199219\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"ytick_7\">\n",
       "     <g id=\"line2d_27\">\n",
       "      <path d=\"M 207.350184 128.16 \n",
       "L 355.056066 128.16 \n",
       "\" clip-path=\"url(#p3ff32109c2)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_28\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#mf84f3d80eb\" x=\"207.350184\" y=\"128.16\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_13\">\n",
       "      <!-- −0.5 -->\n",
       "      <g transform=\"translate(176.067371 131.959219) scale(0.1 -0.1)\">\n",
       "       <use xlink:href=\"#DejaVuSans-2212\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-30\" x=\"83.789062\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-2e\" x=\"147.412109\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-35\" x=\"179.199219\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"ytick_8\">\n",
       "     <g id=\"line2d_29\">\n",
       "      <path d=\"M 207.350184 90.36 \n",
       "L 355.056066 90.36 \n",
       "\" clip-path=\"url(#p3ff32109c2)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_30\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#mf84f3d80eb\" x=\"207.350184\" y=\"90.36\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_14\">\n",
       "      <!-- 0.0 -->\n",
       "      <g transform=\"translate(184.447059 94.159219) scale(0.1 -0.1)\">\n",
       "       <use xlink:href=\"#DejaVuSans-30\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-30\" x=\"95.410156\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"ytick_9\">\n",
       "     <g id=\"line2d_31\">\n",
       "      <path d=\"M 207.350184 52.56 \n",
       "L 355.056066 52.56 \n",
       "\" clip-path=\"url(#p3ff32109c2)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_32\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#mf84f3d80eb\" x=\"207.350184\" y=\"52.56\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_15\">\n",
       "      <!-- 0.5 -->\n",
       "      <g transform=\"translate(184.447059 56.359219) scale(0.1 -0.1)\">\n",
       "       <use xlink:href=\"#DejaVuSans-30\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-35\" x=\"95.410156\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"ytick_10\">\n",
       "     <g id=\"line2d_33\">\n",
       "      <path d=\"M 207.350184 14.76 \n",
       "L 355.056066 14.76 \n",
       "\" clip-path=\"url(#p3ff32109c2)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_34\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#mf84f3d80eb\" x=\"207.350184\" y=\"14.76\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_16\">\n",
       "      <!-- 1.0 -->\n",
       "      <g transform=\"translate(184.447059 18.559219) scale(0.1 -0.1)\">\n",
       "       <use xlink:href=\"#DejaVuSans-31\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-30\" x=\"95.410156\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "   </g>\n",
       "   <g id=\"line2d_35\">\n",
       "    <path d=\"M 214.064088 14.76 \n",
       "L 214.73716 14.909179 \n",
       "L 215.410232 15.356127 \n",
       "L 216.083305 16.099077 \n",
       "L 216.756381 17.135113 \n",
       "L 217.76599 19.229413 \n",
       "L 218.775598 21.955074 \n",
       "L 219.785207 25.287884 \n",
       "L 221.131352 30.624264 \n",
       "L 222.477501 36.90271 \n",
       "L 224.160182 45.923418 \n",
       "L 226.179403 58.171073 \n",
       "L 228.871693 76.193931 \n",
       "L 235.938961 124.681663 \n",
       "L 237.958182 136.69576 \n",
       "L 239.640863 145.470019 \n",
       "L 240.987008 151.521667 \n",
       "L 242.333153 156.608774 \n",
       "L 243.342766 159.742252 \n",
       "L 244.352374 162.25987 \n",
       "L 245.361983 164.139306 \n",
       "L 246.035059 165.02924 \n",
       "L 246.708132 165.624484 \n",
       "L 247.381204 165.922694 \n",
       "L 248.054279 165.922694 \n",
       "L 248.727351 165.624484 \n",
       "L 249.400423 165.02924 \n",
       "L 250.073496 164.139306 \n",
       "L 250.74657 162.958204 \n",
       "L 251.756179 160.651107 \n",
       "L 252.765787 157.720098 \n",
       "L 253.775398 154.191196 \n",
       "L 255.121543 148.610814 \n",
       "L 256.46769 142.111763 \n",
       "L 258.150371 132.853516 \n",
       "L 260.169592 120.384388 \n",
       "L 262.861884 102.186453 \n",
       "L 269.256078 58.171093 \n",
       "L 271.275298 45.923431 \n",
       "L 272.95798 36.902728 \n",
       "L 274.304126 30.624278 \n",
       "L 275.650272 25.287902 \n",
       "L 276.996418 20.977748 \n",
       "L 278.006028 18.460126 \n",
       "L 279.015637 16.580694 \n",
       "L 279.688711 15.69076 \n",
       "L 280.361784 15.095516 \n",
       "L 281.034857 14.797306 \n",
       "L 281.70793 14.797306 \n",
       "L 282.381003 15.095516 \n",
       "L 283.054076 15.69076 \n",
       "L 283.727149 16.580694 \n",
       "L 284.400222 17.761796 \n",
       "L 285.409831 20.068897 \n",
       "L 286.419441 22.999906 \n",
       "L 287.42905 26.528809 \n",
       "L 288.775197 32.109199 \n",
       "L 290.121343 38.608237 \n",
       "L 291.804026 47.866498 \n",
       "L 293.823245 60.335619 \n",
       "L 296.515537 78.533561 \n",
       "L 302.90973 122.548923 \n",
       "L 304.92895 134.79656 \n",
       "L 306.611633 143.817272 \n",
       "L 307.95778 150.095722 \n",
       "L 309.303924 155.432102 \n",
       "L 310.650071 159.742256 \n",
       "L 311.65968 162.25987 \n",
       "L 312.66929 164.139306 \n",
       "L 313.342363 165.02924 \n",
       "L 314.015435 165.624484 \n",
       "L 314.688508 165.922694 \n",
       "L 315.361582 165.922694 \n",
       "L 316.034654 165.624484 \n",
       "L 316.707727 165.02924 \n",
       "L 317.380799 164.13931 \n",
       "L 318.053872 162.958209 \n",
       "L 319.063484 160.651103 \n",
       "L 320.073093 157.720094 \n",
       "L 321.082702 154.191196 \n",
       "L 322.42885 148.610796 \n",
       "L 323.774995 142.111749 \n",
       "L 325.457676 132.853516 \n",
       "L 327.476898 120.384358 \n",
       "L 330.169191 102.186414 \n",
       "L 336.563383 58.171073 \n",
       "L 338.5826 45.923449 \n",
       "L 340.265285 36.90271 \n",
       "L 341.61143 30.624287 \n",
       "L 342.957579 25.287884 \n",
       "L 344.303724 20.977748 \n",
       "L 345.313333 18.460126 \n",
       "L 346.322941 16.58069 \n",
       "L 346.996014 15.69076 \n",
       "L 347.669086 15.095521 \n",
       "L 348.342162 14.797306 \n",
       "L 348.342162 14.797306 \n",
       "\" clip-path=\"url(#p3ff32109c2)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5; stroke-linecap: square\"/>\n",
       "   </g>\n",
       "   <g id=\"line2d_36\">\n",
       "    <path d=\"M 230.890914 90.359999 \n",
       "L 315.025046 165.96 \n",
       "\" clip-path=\"url(#p3ff32109c2)\" style=\"fill: none; stroke-dasharray: 5.55,2.4; stroke-dashoffset: 0; stroke: #bf00bf; stroke-width: 1.5\"/>\n",
       "   </g>\n",
       "   <g id=\"patch_8\">\n",
       "    <path d=\"M 207.350184 173.52 \n",
       "L 207.350184 7.2 \n",
       "\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
       "   </g>\n",
       "   <g id=\"patch_9\">\n",
       "    <path d=\"M 355.056066 173.52 \n",
       "L 355.056066 7.2 \n",
       "\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
       "   </g>\n",
       "   <g id=\"patch_10\">\n",
       "    <path d=\"M 207.350184 173.52 \n",
       "L 355.056066 173.52 \n",
       "\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
       "   </g>\n",
       "   <g id=\"patch_11\">\n",
       "    <path d=\"M 207.350184 7.2 \n",
       "L 355.056066 7.2 \n",
       "\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
       "   </g>\n",
       "  </g>\n",
       "  <g id=\"axes_3\">\n",
       "   <g id=\"patch_12\">\n",
       "    <path d=\"M 384.597243 173.52 \n",
       "L 532.303125 173.52 \n",
       "L 532.303125 7.2 \n",
       "L 384.597243 7.2 \n",
       "z\n",
       "\" style=\"fill: #ffffff\"/>\n",
       "   </g>\n",
       "   <g id=\"matplotlib.axis_5\">\n",
       "    <g id=\"xtick_7\">\n",
       "     <g id=\"line2d_37\">\n",
       "      <path d=\"M 391.311146 173.52 \n",
       "L 391.311146 7.2 \n",
       "\" clip-path=\"url(#p230321aa82)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_38\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#m1543edb625\" x=\"391.311146\" y=\"173.52\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_17\">\n",
       "      <!-- −2 -->\n",
       "      <g transform=\"translate(383.940053 188.118438) scale(0.1 -0.1)\">\n",
       "       <use xlink:href=\"#DejaVuSans-2212\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-32\" x=\"83.789062\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"xtick_8\">\n",
       "     <g id=\"line2d_39\">\n",
       "      <path d=\"M 458.618452 173.52 \n",
       "L 458.618452 7.2 \n",
       "\" clip-path=\"url(#p230321aa82)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_40\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#m1543edb625\" x=\"458.618452\" y=\"173.52\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_18\">\n",
       "      <!-- 0 -->\n",
       "      <g transform=\"translate(455.437202 188.118438) scale(0.1 -0.1)\">\n",
       "       <use xlink:href=\"#DejaVuSans-30\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"xtick_9\">\n",
       "     <g id=\"line2d_41\">\n",
       "      <path d=\"M 525.925757 173.52 \n",
       "L 525.925757 7.2 \n",
       "\" clip-path=\"url(#p230321aa82)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_42\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#m1543edb625\" x=\"525.925757\" y=\"173.52\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_19\">\n",
       "      <!-- 2 -->\n",
       "      <g transform=\"translate(522.744507 188.118438) scale(0.1 -0.1)\">\n",
       "       <use xlink:href=\"#DejaVuSans-32\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "   </g>\n",
       "   <g id=\"matplotlib.axis_6\">\n",
       "    <g id=\"ytick_11\">\n",
       "     <g id=\"line2d_43\">\n",
       "      <path d=\"M 384.597243 157.411453 \n",
       "L 532.303125 157.411453 \n",
       "\" clip-path=\"url(#p230321aa82)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_44\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#mf84f3d80eb\" x=\"384.597243\" y=\"157.411453\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_20\">\n",
       "      <!-- 0.5 -->\n",
       "      <g transform=\"translate(361.694118 161.210672) scale(0.1 -0.1)\">\n",
       "       <use xlink:href=\"#DejaVuSans-30\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-35\" x=\"95.410156\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"ytick_12\">\n",
       "     <g id=\"line2d_45\">\n",
       "      <path d=\"M 384.597243 125.06014 \n",
       "L 532.303125 125.06014 \n",
       "\" clip-path=\"url(#p230321aa82)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_46\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#mf84f3d80eb\" x=\"384.597243\" y=\"125.06014\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_21\">\n",
       "      <!-- 1.0 -->\n",
       "      <g transform=\"translate(361.694118 128.859359) scale(0.1 -0.1)\">\n",
       "       <use xlink:href=\"#DejaVuSans-31\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-30\" x=\"95.410156\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"ytick_13\">\n",
       "     <g id=\"line2d_47\">\n",
       "      <path d=\"M 384.597243 92.708827 \n",
       "L 532.303125 92.708827 \n",
       "\" clip-path=\"url(#p230321aa82)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_48\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#mf84f3d80eb\" x=\"384.597243\" y=\"92.708827\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_22\">\n",
       "      <!-- 1.5 -->\n",
       "      <g transform=\"translate(361.694118 96.508046) scale(0.1 -0.1)\">\n",
       "       <use xlink:href=\"#DejaVuSans-31\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-35\" x=\"95.410156\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"ytick_14\">\n",
       "     <g id=\"line2d_49\">\n",
       "      <path d=\"M 384.597243 60.357514 \n",
       "L 532.303125 60.357514 \n",
       "\" clip-path=\"url(#p230321aa82)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_50\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#mf84f3d80eb\" x=\"384.597243\" y=\"60.357514\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_23\">\n",
       "      <!-- 2.0 -->\n",
       "      <g transform=\"translate(361.694118 64.156733) scale(0.1 -0.1)\">\n",
       "       <use xlink:href=\"#DejaVuSans-32\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-30\" x=\"95.410156\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"ytick_15\">\n",
       "     <g id=\"line2d_51\">\n",
       "      <path d=\"M 384.597243 28.006201 \n",
       "L 532.303125 28.006201 \n",
       "\" clip-path=\"url(#p230321aa82)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_52\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#mf84f3d80eb\" x=\"384.597243\" y=\"28.006201\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_24\">\n",
       "      <!-- 2.5 -->\n",
       "      <g transform=\"translate(361.694118 31.805419) scale(0.1 -0.1)\">\n",
       "       <use xlink:href=\"#DejaVuSans-32\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-2e\" x=\"63.623047\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-35\" x=\"95.410156\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "   </g>\n",
       "   <g id=\"line2d_53\">\n",
       "    <path d=\"M 391.311146 165.96 \n",
       "L 398.041875 163.456644 \n",
       "L 404.436075 160.835002 \n",
       "L 410.493726 158.110753 \n",
       "L 416.21485 155.302677 \n",
       "L 421.935969 152.24548 \n",
       "L 427.320555 149.120777 \n",
       "L 432.705138 145.735825 \n",
       "L 437.753187 142.306822 \n",
       "L 442.801235 138.610756 \n",
       "L 447.849283 134.626827 \n",
       "L 452.560795 130.629018 \n",
       "L 457.272306 126.341338 \n",
       "L 461.983817 121.742767 \n",
       "L 466.358792 117.174613 \n",
       "L 470.733766 112.29966 \n",
       "L 475.108741 107.097315 \n",
       "L 479.483717 101.545579 \n",
       "L 483.858692 95.620998 \n",
       "L 488.233666 89.298533 \n",
       "L 492.272105 83.086174 \n",
       "L 496.310543 76.489661 \n",
       "L 500.348982 69.485237 \n",
       "L 504.38742 62.04769 \n",
       "L 508.425859 54.15023 \n",
       "L 512.464297 45.764427 \n",
       "L 516.502732 36.860069 \n",
       "L 520.541174 27.405084 \n",
       "L 524.579609 17.365457 \n",
       "L 525.589221 14.76 \n",
       "L 525.589221 14.76 \n",
       "\" clip-path=\"url(#p230321aa82)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5; stroke-linecap: square\"/>\n",
       "   </g>\n",
       "   <g id=\"line2d_54\">\n",
       "    <path d=\"M 408.137973 159.199411 \n",
       "L 492.272105 83.086174 \n",
       "\" clip-path=\"url(#p230321aa82)\" style=\"fill: none; stroke-dasharray: 5.55,2.4; stroke-dashoffset: 0; stroke: #bf00bf; stroke-width: 1.5\"/>\n",
       "   </g>\n",
       "   <g id=\"patch_13\">\n",
       "    <path d=\"M 384.597243 173.52 \n",
       "L 384.597243 7.2 \n",
       "\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
       "   </g>\n",
       "   <g id=\"patch_14\">\n",
       "    <path d=\"M 532.303125 173.52 \n",
       "L 532.303125 7.2 \n",
       "\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
       "   </g>\n",
       "   <g id=\"patch_15\">\n",
       "    <path d=\"M 384.597243 173.52 \n",
       "L 532.303125 173.52 \n",
       "\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
       "   </g>\n",
       "   <g id=\"patch_16\">\n",
       "    <path d=\"M 384.597243 7.2 \n",
       "L 532.303125 7.2 \n",
       "\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
       "   </g>\n",
       "  </g>\n",
       " </g>\n",
       " <defs>\n",
       "  <clipPath id=\"p4d56cb7f55\">\n",
       "   <rect x=\"30.103125\" y=\"7.2\" width=\"147.705882\" height=\"166.32\"/>\n",
       "  </clipPath>\n",
       "  <clipPath id=\"p3ff32109c2\">\n",
       "   <rect x=\"207.350184\" y=\"7.2\" width=\"147.705882\" height=\"166.32\"/>\n",
       "  </clipPath>\n",
       "  <clipPath id=\"p230321aa82\">\n",
       "   <rect x=\"384.597243\" y=\"7.2\" width=\"147.705882\" height=\"166.32\"/>\n",
       "  </clipPath>\n",
       " </defs>\n",
       "</svg>\n"
      ],
      "text/plain": [
       "<Figure size 900x300 with 3 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "f = lambda x: 0.5 * x**2  # Convex\n",
    "g = lambda x: torch.cos(np.pi * x)  # Nonconvex\n",
    "h = lambda x: torch.exp(0.5 * x)  # Convex\n",
    "\n",
    "x, segment = torch.arange(-2, 2, 0.01), torch.tensor([-1.5, 1])\n",
    "d2l.use_svg_display()\n",
    "_, axes = d2l.plt.subplots(1, 3, figsize=(9, 3))\n",
    "for ax, func in zip(axes, [f, g, h]):\n",
    "    d2l.plot([x, segment], [func(x), func(segment)], axes=ax)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b1991f0b",
   "metadata": {
    "origin_pos": 6
   },
   "source": [
    "As expected, the cosine function is *nonconvex*, whereas the parabola and the exponential function are. Note that the requirement that $\\mathcal{X}$ is a convex set is necessary for the condition to make sense. Otherwise the outcome of $f(\\lambda x + (1-\\lambda) x')$ might not be well defined.\n",
    "\n",
    "\n",
    "### Jensen's Inequality\n",
    "\n",
    "Given a convex function $f$,\n",
    "one of the most useful mathematical tools\n",
    "is *Jensen's inequality*.\n",
    "It amounts to a generalization of the definition of convexity:\n",
    "\n",
    "$$\\sum_i \\alpha_i f(x_i)  \\geq f\\left(\\sum_i \\alpha_i x_i\\right)    \\textrm{ and }    E_X[f(X)]  \\geq f\\left(E_X[X]\\right),$$\n",
    ":eqlabel:`eq_jensens-inequality`\n",
    "\n",
    "where $\\alpha_i$ are nonnegative real numbers such that $\\sum_i \\alpha_i = 1$ and $X$ is a random variable.\n",
    "In other words, the expectation of a convex function is no less than the convex function of an expectation, where the latter is usually a simpler expression. \n",
    "To prove the first inequality we repeatedly apply the definition of convexity to one term in the sum at a time.\n",
    "\n",
    "\n",
    "One of the common applications of Jensen's inequality is\n",
    "to bound a more complicated expression by a simpler one.\n",
    "For example,\n",
    "its application can be\n",
    "with regard to the log-likelihood of partially observed random variables. That is, we use\n",
    "\n",
    "$$E_{Y \\sim P(Y)}[-\\log P(X \\mid Y)] \\geq -\\log P(X),$$\n",
    "\n",
    "since $\\int P(Y) P(X \\mid Y) dY = P(X)$.\n",
    "This can be used in variational methods. Here $Y$ is typically the unobserved random variable, $P(Y)$ is the best guess of how it might be distributed, and $P(X)$ is the distribution with $Y$ integrated out. For instance, in clustering $Y$ might be the cluster labels and $P(X \\mid Y)$ is the generative model when applying cluster labels.\n",
    "\n",
    "\n",
    "\n",
    "## Properties\n",
    "\n",
    "Convex functions have many useful properties. We describe a few commonly-used ones below.\n",
    "\n",
    "\n",
    "### Local Minima Are Global Minima\n",
    "\n",
    "First and foremost, the local minima of convex functions are also the global minima. \n",
    "We can prove it by contradiction as follows.\n",
    "\n",
    "Consider a convex function $f$ defined on a convex set $\\mathcal{X}$.\n",
    "Suppose that $x^{\\ast} \\in \\mathcal{X}$ is a local minimum:\n",
    "there exists a small positive value $p$ so that for $x \\in \\mathcal{X}$ that satisfies $0 < |x - x^{\\ast}| \\leq p$ we have $f(x^{\\ast}) < f(x)$.\n",
    "\n",
    "Assume that the local minimum $x^{\\ast}$\n",
    "is not the global minimum of $f$:\n",
    "there exists $x' \\in \\mathcal{X}$ for which $f(x') < f(x^{\\ast})$. \n",
    "There also exists \n",
    "$\\lambda \\in [0, 1)$ such as $\\lambda = 1 - \\frac{p}{|x^{\\ast} - x'|}$\n",
    "so that\n",
    "$0 < |\\lambda x^{\\ast} + (1-\\lambda) x' - x^{\\ast}| \\leq p$. \n",
    "\n",
    "However,\n",
    "according to the definition of convex functions, we have\n",
    "\n",
    "$$\\begin{aligned}\n",
    "    f(\\lambda x^{\\ast} + (1-\\lambda) x') &\\leq \\lambda f(x^{\\ast}) + (1-\\lambda) f(x') \\\\\n",
    "    &< \\lambda f(x^{\\ast}) + (1-\\lambda) f(x^{\\ast}) \\\\\n",
    "    &= f(x^{\\ast}),\n",
    "\\end{aligned}$$\n",
    "\n",
    "which contradicts with our statement that $x^{\\ast}$ is a local minimum.\n",
    "Therefore, there does not exist $x' \\in \\mathcal{X}$ for which $f(x') < f(x^{\\ast})$. The local minimum $x^{\\ast}$ is also the global minimum.\n",
    "\n",
    "For instance, the convex function $f(x) = (x-1)^2$ has a local minimum at $x=1$, which is also the global minimum.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "197137da",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2023-08-18T19:36:38.022503Z",
     "iopub.status.busy": "2023-08-18T19:36:38.021918Z",
     "iopub.status.idle": "2023-08-18T19:36:38.271902Z",
     "shell.execute_reply": "2023-08-18T19:36:38.270705Z"
    },
    "origin_pos": 7,
    "tab": [
     "pytorch"
    ]
   },
   "outputs": [
    {
     "data": {
      "image/svg+xml": [
       "<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"no\"?>\n",
       "<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
       "  \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
       "<svg xmlns:xlink=\"http://www.w3.org/1999/xlink\" width=\"236.740625pt\" height=\"183.35625pt\" viewBox=\"0 0 236.740625 183.35625\" xmlns=\"http://www.w3.org/2000/svg\" version=\"1.1\">\n",
       " <metadata>\n",
       "  <rdf:RDF xmlns:dc=\"http://purl.org/dc/elements/1.1/\" xmlns:cc=\"http://creativecommons.org/ns#\" xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">\n",
       "   <cc:Work>\n",
       "    <dc:type rdf:resource=\"http://purl.org/dc/dcmitype/StillImage\"/>\n",
       "    <dc:date>2023-08-18T19:36:38.224433</dc:date>\n",
       "    <dc:format>image/svg+xml</dc:format>\n",
       "    <dc:creator>\n",
       "     <cc:Agent>\n",
       "      <dc:title>Matplotlib v3.7.2, https://matplotlib.org/</dc:title>\n",
       "     </cc:Agent>\n",
       "    </dc:creator>\n",
       "   </cc:Work>\n",
       "  </rdf:RDF>\n",
       " </metadata>\n",
       " <defs>\n",
       "  <style type=\"text/css\">*{stroke-linejoin: round; stroke-linecap: butt}</style>\n",
       " </defs>\n",
       " <g id=\"figure_1\">\n",
       "  <g id=\"patch_1\">\n",
       "   <path d=\"M 0 183.35625 \n",
       "L 236.740625 183.35625 \n",
       "L 236.740625 0 \n",
       "L 0 0 \n",
       "z\n",
       "\" style=\"fill: #ffffff\"/>\n",
       "  </g>\n",
       "  <g id=\"axes_1\">\n",
       "   <g id=\"patch_2\">\n",
       "    <path d=\"M 34.240625 145.8 \n",
       "L 229.540625 145.8 \n",
       "L 229.540625 7.2 \n",
       "L 34.240625 7.2 \n",
       "z\n",
       "\" style=\"fill: #ffffff\"/>\n",
       "   </g>\n",
       "   <g id=\"matplotlib.axis_1\">\n",
       "    <g id=\"xtick_1\">\n",
       "     <g id=\"line2d_1\">\n",
       "      <path d=\"M 43.117898 145.8 \n",
       "L 43.117898 7.2 \n",
       "\" clip-path=\"url(#pbf218723b4)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_2\">\n",
       "      <defs>\n",
       "       <path id=\"mb9ab10271c\" d=\"M 0 0 \n",
       "L 0 3.5 \n",
       "\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </defs>\n",
       "      <g>\n",
       "       <use xlink:href=\"#mb9ab10271c\" x=\"43.117898\" y=\"145.8\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_1\">\n",
       "      <!-- −2 -->\n",
       "      <g transform=\"translate(35.746804 160.398438) scale(0.1 -0.1)\">\n",
       "       <defs>\n",
       "        <path id=\"DejaVuSans-2212\" d=\"M 678 2272 \n",
       "L 4684 2272 \n",
       "L 4684 1741 \n",
       "L 678 1741 \n",
       "L 678 2272 \n",
       "z\n",
       "\" transform=\"scale(0.015625)\"/>\n",
       "        <path id=\"DejaVuSans-32\" d=\"M 1228 531 \n",
       "L 3431 531 \n",
       "L 3431 0 \n",
       "L 469 0 \n",
       "L 469 531 \n",
       "Q 828 903 1448 1529 \n",
       "Q 2069 2156 2228 2338 \n",
       "Q 2531 2678 2651 2914 \n",
       "Q 2772 3150 2772 3378 \n",
       "Q 2772 3750 2511 3984 \n",
       "Q 2250 4219 1831 4219 \n",
       "Q 1534 4219 1204 4116 \n",
       "Q 875 4013 500 3803 \n",
       "L 500 4441 \n",
       "Q 881 4594 1212 4672 \n",
       "Q 1544 4750 1819 4750 \n",
       "Q 2544 4750 2975 4387 \n",
       "Q 3406 4025 3406 3419 \n",
       "Q 3406 3131 3298 2873 \n",
       "Q 3191 2616 2906 2266 \n",
       "Q 2828 2175 2409 1742 \n",
       "Q 1991 1309 1228 531 \n",
       "z\n",
       "\" transform=\"scale(0.015625)\"/>\n",
       "       </defs>\n",
       "       <use xlink:href=\"#DejaVuSans-2212\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-32\" x=\"83.789062\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"xtick_2\">\n",
       "     <g id=\"line2d_3\">\n",
       "      <path d=\"M 87.615505 145.8 \n",
       "L 87.615505 7.2 \n",
       "\" clip-path=\"url(#pbf218723b4)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_4\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#mb9ab10271c\" x=\"87.615505\" y=\"145.8\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_2\">\n",
       "      <!-- −1 -->\n",
       "      <g transform=\"translate(80.244412 160.398438) scale(0.1 -0.1)\">\n",
       "       <defs>\n",
       "        <path id=\"DejaVuSans-31\" d=\"M 794 531 \n",
       "L 1825 531 \n",
       "L 1825 4091 \n",
       "L 703 3866 \n",
       "L 703 4441 \n",
       "L 1819 4666 \n",
       "L 2450 4666 \n",
       "L 2450 531 \n",
       "L 3481 531 \n",
       "L 3481 0 \n",
       "L 794 0 \n",
       "L 794 531 \n",
       "z\n",
       "\" transform=\"scale(0.015625)\"/>\n",
       "       </defs>\n",
       "       <use xlink:href=\"#DejaVuSans-2212\"/>\n",
       "       <use xlink:href=\"#DejaVuSans-31\" x=\"83.789062\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"xtick_3\">\n",
       "     <g id=\"line2d_5\">\n",
       "      <path d=\"M 132.113113 145.8 \n",
       "L 132.113113 7.2 \n",
       "\" clip-path=\"url(#pbf218723b4)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_6\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#mb9ab10271c\" x=\"132.113113\" y=\"145.8\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_3\">\n",
       "      <!-- 0 -->\n",
       "      <g transform=\"translate(128.931863 160.398438) scale(0.1 -0.1)\">\n",
       "       <defs>\n",
       "        <path id=\"DejaVuSans-30\" d=\"M 2034 4250 \n",
       "Q 1547 4250 1301 3770 \n",
       "Q 1056 3291 1056 2328 \n",
       "Q 1056 1369 1301 889 \n",
       "Q 1547 409 2034 409 \n",
       "Q 2525 409 2770 889 \n",
       "Q 3016 1369 3016 2328 \n",
       "Q 3016 3291 2770 3770 \n",
       "Q 2525 4250 2034 4250 \n",
       "z\n",
       "M 2034 4750 \n",
       "Q 2819 4750 3233 4129 \n",
       "Q 3647 3509 3647 2328 \n",
       "Q 3647 1150 3233 529 \n",
       "Q 2819 -91 2034 -91 \n",
       "Q 1250 -91 836 529 \n",
       "Q 422 1150 422 2328 \n",
       "Q 422 3509 836 4129 \n",
       "Q 1250 4750 2034 4750 \n",
       "z\n",
       "\" transform=\"scale(0.015625)\"/>\n",
       "       </defs>\n",
       "       <use xlink:href=\"#DejaVuSans-30\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"xtick_4\">\n",
       "     <g id=\"line2d_7\">\n",
       "      <path d=\"M 176.61072 145.8 \n",
       "L 176.61072 7.2 \n",
       "\" clip-path=\"url(#pbf218723b4)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_8\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#mb9ab10271c\" x=\"176.61072\" y=\"145.8\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_4\">\n",
       "      <!-- 1 -->\n",
       "      <g transform=\"translate(173.42947 160.398438) scale(0.1 -0.1)\">\n",
       "       <use xlink:href=\"#DejaVuSans-31\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"xtick_5\">\n",
       "     <g id=\"line2d_9\">\n",
       "      <path d=\"M 221.108328 145.8 \n",
       "L 221.108328 7.2 \n",
       "\" clip-path=\"url(#pbf218723b4)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_10\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#mb9ab10271c\" x=\"221.108328\" y=\"145.8\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_5\">\n",
       "      <!-- 2 -->\n",
       "      <g transform=\"translate(217.927078 160.398438) scale(0.1 -0.1)\">\n",
       "       <use xlink:href=\"#DejaVuSans-32\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"text_6\">\n",
       "     <!-- x -->\n",
       "     <g transform=\"translate(128.93125 174.076563) scale(0.1 -0.1)\">\n",
       "      <defs>\n",
       "       <path id=\"DejaVuSans-78\" d=\"M 3513 3500 \n",
       "L 2247 1797 \n",
       "L 3578 0 \n",
       "L 2900 0 \n",
       "L 1881 1375 \n",
       "L 863 0 \n",
       "L 184 0 \n",
       "L 1544 1831 \n",
       "L 300 3500 \n",
       "L 978 3500 \n",
       "L 1906 2253 \n",
       "L 2834 3500 \n",
       "L 3513 3500 \n",
       "z\n",
       "\" transform=\"scale(0.015625)\"/>\n",
       "      </defs>\n",
       "      <use xlink:href=\"#DejaVuSans-78\"/>\n",
       "     </g>\n",
       "    </g>\n",
       "   </g>\n",
       "   <g id=\"matplotlib.axis_2\">\n",
       "    <g id=\"ytick_1\">\n",
       "     <g id=\"line2d_11\">\n",
       "      <path d=\"M 34.240625 139.5 \n",
       "L 229.540625 139.5 \n",
       "\" clip-path=\"url(#pbf218723b4)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_12\">\n",
       "      <defs>\n",
       "       <path id=\"mbe49a8ee7b\" d=\"M 0 0 \n",
       "L -3.5 0 \n",
       "\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </defs>\n",
       "      <g>\n",
       "       <use xlink:href=\"#mbe49a8ee7b\" x=\"34.240625\" y=\"139.5\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_7\">\n",
       "      <!-- 0 -->\n",
       "      <g transform=\"translate(20.878125 143.299219) scale(0.1 -0.1)\">\n",
       "       <use xlink:href=\"#DejaVuSans-30\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"ytick_2\">\n",
       "     <g id=\"line2d_13\">\n",
       "      <path d=\"M 34.240625 111.5 \n",
       "L 229.540625 111.5 \n",
       "\" clip-path=\"url(#pbf218723b4)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_14\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#mbe49a8ee7b\" x=\"34.240625\" y=\"111.5\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_8\">\n",
       "      <!-- 2 -->\n",
       "      <g transform=\"translate(20.878125 115.299219) scale(0.1 -0.1)\">\n",
       "       <use xlink:href=\"#DejaVuSans-32\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"ytick_3\">\n",
       "     <g id=\"line2d_15\">\n",
       "      <path d=\"M 34.240625 83.5 \n",
       "L 229.540625 83.5 \n",
       "\" clip-path=\"url(#pbf218723b4)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_16\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#mbe49a8ee7b\" x=\"34.240625\" y=\"83.5\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_9\">\n",
       "      <!-- 4 -->\n",
       "      <g transform=\"translate(20.878125 87.299219) scale(0.1 -0.1)\">\n",
       "       <defs>\n",
       "        <path id=\"DejaVuSans-34\" d=\"M 2419 4116 \n",
       "L 825 1625 \n",
       "L 2419 1625 \n",
       "L 2419 4116 \n",
       "z\n",
       "M 2253 4666 \n",
       "L 3047 4666 \n",
       "L 3047 1625 \n",
       "L 3713 1625 \n",
       "L 3713 1100 \n",
       "L 3047 1100 \n",
       "L 3047 0 \n",
       "L 2419 0 \n",
       "L 2419 1100 \n",
       "L 313 1100 \n",
       "L 313 1709 \n",
       "L 2253 4666 \n",
       "z\n",
       "\" transform=\"scale(0.015625)\"/>\n",
       "       </defs>\n",
       "       <use xlink:href=\"#DejaVuSans-34\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"ytick_4\">\n",
       "     <g id=\"line2d_17\">\n",
       "      <path d=\"M 34.240625 55.5 \n",
       "L 229.540625 55.5 \n",
       "\" clip-path=\"url(#pbf218723b4)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_18\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#mbe49a8ee7b\" x=\"34.240625\" y=\"55.5\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_10\">\n",
       "      <!-- 6 -->\n",
       "      <g transform=\"translate(20.878125 59.299219) scale(0.1 -0.1)\">\n",
       "       <defs>\n",
       "        <path id=\"DejaVuSans-36\" d=\"M 2113 2584 \n",
       "Q 1688 2584 1439 2293 \n",
       "Q 1191 2003 1191 1497 \n",
       "Q 1191 994 1439 701 \n",
       "Q 1688 409 2113 409 \n",
       "Q 2538 409 2786 701 \n",
       "Q 3034 994 3034 1497 \n",
       "Q 3034 2003 2786 2293 \n",
       "Q 2538 2584 2113 2584 \n",
       "z\n",
       "M 3366 4563 \n",
       "L 3366 3988 \n",
       "Q 3128 4100 2886 4159 \n",
       "Q 2644 4219 2406 4219 \n",
       "Q 1781 4219 1451 3797 \n",
       "Q 1122 3375 1075 2522 \n",
       "Q 1259 2794 1537 2939 \n",
       "Q 1816 3084 2150 3084 \n",
       "Q 2853 3084 3261 2657 \n",
       "Q 3669 2231 3669 1497 \n",
       "Q 3669 778 3244 343 \n",
       "Q 2819 -91 2113 -91 \n",
       "Q 1303 -91 875 529 \n",
       "Q 447 1150 447 2328 \n",
       "Q 447 3434 972 4092 \n",
       "Q 1497 4750 2381 4750 \n",
       "Q 2619 4750 2861 4703 \n",
       "Q 3103 4656 3366 4563 \n",
       "z\n",
       "\" transform=\"scale(0.015625)\"/>\n",
       "       </defs>\n",
       "       <use xlink:href=\"#DejaVuSans-36\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"ytick_5\">\n",
       "     <g id=\"line2d_19\">\n",
       "      <path d=\"M 34.240625 27.5 \n",
       "L 229.540625 27.5 \n",
       "\" clip-path=\"url(#pbf218723b4)\" style=\"fill: none; stroke: #b0b0b0; stroke-width: 0.8; stroke-linecap: square\"/>\n",
       "     </g>\n",
       "     <g id=\"line2d_20\">\n",
       "      <g>\n",
       "       <use xlink:href=\"#mbe49a8ee7b\" x=\"34.240625\" y=\"27.5\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "     <g id=\"text_11\">\n",
       "      <!-- 8 -->\n",
       "      <g transform=\"translate(20.878125 31.299219) scale(0.1 -0.1)\">\n",
       "       <defs>\n",
       "        <path id=\"DejaVuSans-38\" d=\"M 2034 2216 \n",
       "Q 1584 2216 1326 1975 \n",
       "Q 1069 1734 1069 1313 \n",
       "Q 1069 891 1326 650 \n",
       "Q 1584 409 2034 409 \n",
       "Q 2484 409 2743 651 \n",
       "Q 3003 894 3003 1313 \n",
       "Q 3003 1734 2745 1975 \n",
       "Q 2488 2216 2034 2216 \n",
       "z\n",
       "M 1403 2484 \n",
       "Q 997 2584 770 2862 \n",
       "Q 544 3141 544 3541 \n",
       "Q 544 4100 942 4425 \n",
       "Q 1341 4750 2034 4750 \n",
       "Q 2731 4750 3128 4425 \n",
       "Q 3525 4100 3525 3541 \n",
       "Q 3525 3141 3298 2862 \n",
       "Q 3072 2584 2669 2484 \n",
       "Q 3125 2378 3379 2068 \n",
       "Q 3634 1759 3634 1313 \n",
       "Q 3634 634 3220 271 \n",
       "Q 2806 -91 2034 -91 \n",
       "Q 1263 -91 848 271 \n",
       "Q 434 634 434 1313 \n",
       "Q 434 1759 690 2068 \n",
       "Q 947 2378 1403 2484 \n",
       "z\n",
       "M 1172 3481 \n",
       "Q 1172 3119 1398 2916 \n",
       "Q 1625 2713 2034 2713 \n",
       "Q 2441 2713 2670 2916 \n",
       "Q 2900 3119 2900 3481 \n",
       "Q 2900 3844 2670 4047 \n",
       "Q 2441 4250 2034 4250 \n",
       "Q 1625 4250 1398 4047 \n",
       "Q 1172 3844 1172 3481 \n",
       "z\n",
       "\" transform=\"scale(0.015625)\"/>\n",
       "       </defs>\n",
       "       <use xlink:href=\"#DejaVuSans-38\"/>\n",
       "      </g>\n",
       "     </g>\n",
       "    </g>\n",
       "    <g id=\"text_12\">\n",
       "     <!-- f(x) -->\n",
       "     <g transform=\"translate(14.798438 85.121094) rotate(-90) scale(0.1 -0.1)\">\n",
       "      <defs>\n",
       "       <path id=\"DejaVuSans-66\" d=\"M 2375 4863 \n",
       "L 2375 4384 \n",
       "L 1825 4384 \n",
       "Q 1516 4384 1395 4259 \n",
       "Q 1275 4134 1275 3809 \n",
       "L 1275 3500 \n",
       "L 2222 3500 \n",
       "L 2222 3053 \n",
       "L 1275 3053 \n",
       "L 1275 0 \n",
       "L 697 0 \n",
       "L 697 3053 \n",
       "L 147 3053 \n",
       "L 147 3500 \n",
       "L 697 3500 \n",
       "L 697 3744 \n",
       "Q 697 4328 969 4595 \n",
       "Q 1241 4863 1831 4863 \n",
       "L 2375 4863 \n",
       "z\n",
       "\" transform=\"scale(0.015625)\"/>\n",
       "       <path id=\"DejaVuSans-28\" d=\"M 1984 4856 \n",
       "Q 1566 4138 1362 3434 \n",
       "Q 1159 2731 1159 2009 \n",
       "Q 1159 1288 1364 580 \n",
       "Q 1569 -128 1984 -844 \n",
       "L 1484 -844 \n",
       "Q 1016 -109 783 600 \n",
       "Q 550 1309 550 2009 \n",
       "Q 550 2706 781 3412 \n",
       "Q 1013 4119 1484 4856 \n",
       "L 1984 4856 \n",
       "z\n",
       "\" transform=\"scale(0.015625)\"/>\n",
       "       <path id=\"DejaVuSans-29\" d=\"M 513 4856 \n",
       "L 1013 4856 \n",
       "Q 1481 4119 1714 3412 \n",
       "Q 1947 2706 1947 2009 \n",
       "Q 1947 1309 1714 600 \n",
       "Q 1481 -109 1013 -844 \n",
       "L 513 -844 \n",
       "Q 928 -128 1133 580 \n",
       "Q 1338 1288 1338 2009 \n",
       "Q 1338 2731 1133 3434 \n",
       "Q 928 4138 513 4856 \n",
       "z\n",
       "\" transform=\"scale(0.015625)\"/>\n",
       "      </defs>\n",
       "      <use xlink:href=\"#DejaVuSans-66\"/>\n",
       "      <use xlink:href=\"#DejaVuSans-28\" x=\"35.205078\"/>\n",
       "      <use xlink:href=\"#DejaVuSans-78\" x=\"74.21875\"/>\n",
       "      <use xlink:href=\"#DejaVuSans-29\" x=\"133.398438\"/>\n",
       "     </g>\n",
       "    </g>\n",
       "   </g>\n",
       "   <g id=\"line2d_21\">\n",
       "    <path d=\"M 43.117898 13.5 \n",
       "L 48.902586 24.183409 \n",
       "L 54.687275 34.393602 \n",
       "L 60.471969 44.130608 \n",
       "L 65.811677 52.698601 \n",
       "L 71.15139 60.863407 \n",
       "L 76.491103 68.625 \n",
       "L 81.830817 75.983393 \n",
       "L 87.17053 82.938599 \n",
       "L 92.065267 88.960001 \n",
       "L 96.960002 94.642601 \n",
       "L 101.854739 99.986395 \n",
       "L 106.749477 104.991403 \n",
       "L 111.644214 109.657598 \n",
       "L 116.093973 113.605599 \n",
       "L 120.543735 117.2736 \n",
       "L 124.993496 120.661602 \n",
       "L 129.443256 123.769602 \n",
       "L 133.893017 126.5976 \n",
       "L 138.342778 129.1456 \n",
       "L 142.792538 131.4136 \n",
       "L 146.797323 133.2154 \n",
       "L 150.802109 134.7904 \n",
       "L 154.806892 136.1386 \n",
       "L 158.811678 137.26 \n",
       "L 162.816462 138.1546 \n",
       "L 166.821248 138.8224 \n",
       "L 170.826032 139.2634 \n",
       "L 174.830815 139.4776 \n",
       "L 178.835599 139.465 \n",
       "L 182.840385 139.2256 \n",
       "L 186.845171 138.7594 \n",
       "L 190.849952 138.066401 \n",
       "L 194.854738 137.1466 \n",
       "L 198.859524 136 \n",
       "L 202.86431 134.626599 \n",
       "L 206.869091 133.026401 \n",
       "L 210.873877 131.1994 \n",
       "L 214.878664 129.1456 \n",
       "L 219.32842 126.597602 \n",
       "L 220.663352 125.7786 \n",
       "L 220.663352 125.7786 \n",
       "\" clip-path=\"url(#pbf218723b4)\" style=\"fill: none; stroke: #1f77b4; stroke-width: 1.5; stroke-linecap: square\"/>\n",
       "   </g>\n",
       "   <g id=\"line2d_22\">\n",
       "    <path d=\"M 65.366702 52 \n",
       "L 176.61072 139.5 \n",
       "\" clip-path=\"url(#pbf218723b4)\" style=\"fill: none; stroke-dasharray: 5.55,2.4; stroke-dashoffset: 0; stroke: #bf00bf; stroke-width: 1.5\"/>\n",
       "   </g>\n",
       "   <g id=\"patch_3\">\n",
       "    <path d=\"M 34.240625 145.8 \n",
       "L 34.240625 7.2 \n",
       "\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
       "   </g>\n",
       "   <g id=\"patch_4\">\n",
       "    <path d=\"M 229.540625 145.8 \n",
       "L 229.540625 7.2 \n",
       "\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
       "   </g>\n",
       "   <g id=\"patch_5\">\n",
       "    <path d=\"M 34.240625 145.8 \n",
       "L 229.540625 145.8 \n",
       "\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
       "   </g>\n",
       "   <g id=\"patch_6\">\n",
       "    <path d=\"M 34.240625 7.2 \n",
       "L 229.540625 7.2 \n",
       "\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
       "   </g>\n",
       "  </g>\n",
       " </g>\n",
       " <defs>\n",
       "  <clipPath id=\"pbf218723b4\">\n",
       "   <rect x=\"34.240625\" y=\"7.2\" width=\"195.3\" height=\"138.6\"/>\n",
       "  </clipPath>\n",
       " </defs>\n",
       "</svg>\n"
      ],
      "text/plain": [
       "<Figure size 350x250 with 1 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "f = lambda x: (x - 1) ** 2\n",
    "d2l.set_figsize()\n",
    "d2l.plot([x, segment], [f(x), f(segment)], 'x', 'f(x)')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8aa5bd7e",
   "metadata": {
    "origin_pos": 8
   },
   "source": [
    "The fact that the local minima for convex functions are also the global minima is very convenient. \n",
    "It means that if we minimize functions we cannot \"get stuck\". \n",
    "Note, though, that this does not mean that there cannot be more than one global minimum or that there might even exist one. For instance, the function $f(x) = \\mathrm{max}(|x|-1, 0)$ attains its minimum value over the interval $[-1, 1]$. Conversely, the function $f(x) = \\exp(x)$ does not attain a minimum value on $\\mathbb{R}$: for $x \\to -\\infty$ it asymptotes to $0$, but there is no $x$ for which $f(x) = 0$.\n",
    "\n",
    "### Below Sets of Convex Functions Are Convex\n",
    "\n",
    "We can conveniently \n",
    "define convex sets \n",
    "via *below sets* of convex functions.\n",
    "Concretely,\n",
    "given a convex function $f$ defined on a convex set $\\mathcal{X}$,\n",
    "any below set\n",
    "\n",
    "$$\\mathcal{S}_b \\stackrel{\\textrm{def}}{=} \\{x | x \\in \\mathcal{X} \\textrm{ and } f(x) \\leq b\\}$$\n",
    "\n",
    "is convex. \n",
    "\n",
    "Let's prove this quickly. Recall that for any $x, x' \\in \\mathcal{S}_b$ we need to show that $\\lambda x + (1-\\lambda) x' \\in \\mathcal{S}_b$ as long as $\\lambda \\in [0, 1]$. \n",
    "Since $f(x) \\leq b$ and $f(x') \\leq b$,\n",
    "by the definition of convexity we have \n",
    "\n",
    "$$f(\\lambda x + (1-\\lambda) x') \\leq \\lambda f(x) + (1-\\lambda) f(x') \\leq b.$$\n",
    "\n",
    "\n",
    "### Convexity and Second Derivatives\n",
    "\n",
    "Whenever the second derivative of a function $f: \\mathbb{R}^n \\rightarrow \\mathbb{R}$ exists it is very easy to check whether $f$ is convex. \n",
    "All we need to do is check whether the Hessian of $f$ is positive semidefinite: $\\nabla^2f \\succeq 0$, i.e., \n",
    "denoting the Hessian matrix $\\nabla^2f$ by $\\mathbf{H}$,\n",
    "$\\mathbf{x}^\\top \\mathbf{H} \\mathbf{x} \\geq 0$\n",
    "for all $\\mathbf{x} \\in \\mathbb{R}^n$.\n",
    "For instance, the function $f(\\mathbf{x}) = \\frac{1}{2} \\|\\mathbf{x}\\|^2$ is convex since $\\nabla^2 f = \\mathbf{1}$, i.e., its Hessian is an identity matrix.\n",
    "\n",
    "\n",
    "Formally, a twice-differentiable one-dimensional function $f: \\mathbb{R} \\rightarrow \\mathbb{R}$ is convex\n",
    "if and only if its second derivative $f'' \\geq 0$. For any twice-differentiable multidimensional function $f: \\mathbb{R}^{n} \\rightarrow \\mathbb{R}$,\n",
    "it is convex if and only if its Hessian $\\nabla^2f \\succeq 0$.\n",
    "\n",
    "First, we need to prove the one-dimensional case.\n",
    "To see that \n",
    "convexity of $f$ implies \n",
    "$f'' \\geq 0$  we use the fact that\n",
    "\n",
    "$$\\frac{1}{2} f(x + \\epsilon) + \\frac{1}{2} f(x - \\epsilon) \\geq f\\left(\\frac{x + \\epsilon}{2} + \\frac{x - \\epsilon}{2}\\right) = f(x).$$\n",
    "\n",
    "Since the second derivative is given by the limit over finite differences it follows that\n",
    "\n",
    "$$f''(x) = \\lim_{\\epsilon \\to 0} \\frac{f(x+\\epsilon) + f(x - \\epsilon) - 2f(x)}{\\epsilon^2} \\geq 0.$$\n",
    "\n",
    "To see that \n",
    "$f'' \\geq 0$ implies that $f$ is convex\n",
    "we use the fact that $f'' \\geq 0$ implies that $f'$ is a monotonically nondecreasing function. Let $a < x < b$ be three points in $\\mathbb{R}$,\n",
    "where $x = (1-\\lambda)a + \\lambda b$ and $\\lambda \\in (0, 1)$.\n",
    "According to the mean value theorem,\n",
    "there exist $\\alpha \\in [a, x]$ and $\\beta \\in [x, b]$\n",
    "such that\n",
    "\n",
    "$$f'(\\alpha) = \\frac{f(x) - f(a)}{x-a} \\textrm{ and } f'(\\beta) = \\frac{f(b) - f(x)}{b-x}.$$\n",
    "\n",
    "\n",
    "By monotonicity $f'(\\beta) \\geq f'(\\alpha)$, hence\n",
    "\n",
    "$$\\frac{x-a}{b-a}f(b) + \\frac{b-x}{b-a}f(a) \\geq f(x).$$\n",
    "\n",
    "Since $x = (1-\\lambda)a + \\lambda b$,\n",
    "we have\n",
    "\n",
    "$$\\lambda f(b) + (1-\\lambda)f(a) \\geq f((1-\\lambda)a + \\lambda b),$$\n",
    "\n",
    "thus proving convexity.\n",
    "\n",
    "Second, we need a lemma before \n",
    "proving the multidimensional case:\n",
    "$f: \\mathbb{R}^n \\rightarrow \\mathbb{R}$\n",
    "is convex if and only if for all $\\mathbf{x}, \\mathbf{y} \\in \\mathbb{R}^n$\n",
    "\n",
    "$$g(z) \\stackrel{\\textrm{def}}{=} f(z \\mathbf{x} + (1-z)  \\mathbf{y}) \\textrm{ where } z \\in [0,1]$$ \n",
    "\n",
    "is convex.\n",
    "\n",
    "To prove that convexity of $f$ implies that $g$ is convex,\n",
    "we can show that for all $a, b, \\lambda \\in [0, 1]$ (thus\n",
    "$0 \\leq \\lambda a + (1-\\lambda) b \\leq 1$)\n",
    "\n",
    "$$\\begin{aligned} &g(\\lambda a + (1-\\lambda) b)\\\\\n",
    "=&f\\left(\\left(\\lambda a + (1-\\lambda) b\\right)\\mathbf{x} + \\left(1-\\lambda a - (1-\\lambda) b\\right)\\mathbf{y} \\right)\\\\\n",
    "=&f\\left(\\lambda \\left(a \\mathbf{x} + (1-a)  \\mathbf{y}\\right)  + (1-\\lambda) \\left(b \\mathbf{x} + (1-b)  \\mathbf{y}\\right) \\right)\\\\\n",
    "\\leq& \\lambda f\\left(a \\mathbf{x} + (1-a)  \\mathbf{y}\\right)  + (1-\\lambda) f\\left(b \\mathbf{x} + (1-b)  \\mathbf{y}\\right) \\\\\n",
    "=& \\lambda g(a) + (1-\\lambda) g(b).\n",
    "\\end{aligned}$$\n",
    "\n",
    "To prove the converse,\n",
    "we can show that for \n",
    "all $\\lambda \\in [0, 1]$ \n",
    "\n",
    "$$\\begin{aligned} &f(\\lambda \\mathbf{x} + (1-\\lambda) \\mathbf{y})\\\\\n",
    "=&g(\\lambda \\cdot 1 + (1-\\lambda) \\cdot 0)\\\\\n",
    "\\leq& \\lambda g(1)  + (1-\\lambda) g(0) \\\\\n",
    "=& \\lambda f(\\mathbf{x}) + (1-\\lambda) f(\\mathbf{y}).\n",
    "\\end{aligned}$$\n",
    "\n",
    "\n",
    "Finally,\n",
    "using the lemma above and the result of the one-dimensional case,\n",
    "the multidimensional case\n",
    "can be proven as follows.\n",
    "A multidimensional function $f: \\mathbb{R}^n \\rightarrow \\mathbb{R}$ is convex\n",
    "if and only if for all $\\mathbf{x}, \\mathbf{y} \\in \\mathbb{R}^n$ $g(z) \\stackrel{\\textrm{def}}{=} f(z \\mathbf{x} + (1-z)  \\mathbf{y})$, where $z \\in [0,1]$,\n",
    "is convex.\n",
    "According to the one-dimensional case,\n",
    "this holds if and only if\n",
    "$g'' = (\\mathbf{x} - \\mathbf{y})^\\top \\mathbf{H}(\\mathbf{x} - \\mathbf{y}) \\geq 0$ ($\\mathbf{H} \\stackrel{\\textrm{def}}{=} \\nabla^2f$)\n",
    "for all $\\mathbf{x}, \\mathbf{y} \\in \\mathbb{R}^n$,\n",
    "which is equivalent to $\\mathbf{H} \\succeq 0$\n",
    "per the definition of positive semidefinite matrices.\n",
    "\n",
    "\n",
    "## Constraints\n",
    "\n",
    "One of the nice properties of convex optimization is that it allows us to handle constraints efficiently. That is, it allows us to solve *constrained optimization* problems of the form:\n",
    "\n",
    "$$\\begin{aligned} \\mathop{\\textrm{minimize~}}_{\\mathbf{x}} & f(\\mathbf{x}) \\\\\n",
    "    \\textrm{ subject to } & c_i(\\mathbf{x}) \\leq 0 \\textrm{ for all } i \\in \\{1, \\ldots, n\\},\n",
    "\\end{aligned}$$\n",
    "\n",
    "where $f$ is the objective and the functions $c_i$ are constraint functions. To see what this does consider the case where $c_1(\\mathbf{x}) = \\|\\mathbf{x}\\|_2 - 1$. In this case the parameters $\\mathbf{x}$ are constrained to the unit ball. If a second constraint is $c_2(\\mathbf{x}) = \\mathbf{v}^\\top \\mathbf{x} + b$, then this corresponds to all $\\mathbf{x}$ lying on a half-space. Satisfying both constraints simultaneously amounts to selecting a slice of a ball.\n",
    "\n",
    "### Lagrangian\n",
    "\n",
    "In general, solving a constrained optimization problem is difficult. One way of addressing it stems from physics with a rather simple intuition. Imagine a ball inside a box. The ball will roll to the place that is lowest and the forces of gravity will be balanced out with the forces that the sides of the box can impose on the ball. In short, the gradient of the objective function (i.e., gravity) will be offset by the gradient of the constraint function (the ball need to remain inside the box by virtue of the walls \"pushing back\"). \n",
    "Note that some constraints may not be active:\n",
    "the walls that are not touched by the ball\n",
    "will not be able to exert any force on the ball.\n",
    "\n",
    "\n",
    "Skipping over the derivation of the *Lagrangian* $L$,\n",
    "the above reasoning\n",
    "can be expressed via the following saddle point optimization problem:\n",
    "\n",
    "$$L(\\mathbf{x}, \\alpha_1, \\ldots, \\alpha_n) = f(\\mathbf{x}) + \\sum_{i=1}^n \\alpha_i c_i(\\mathbf{x}) \\textrm{ where } \\alpha_i \\geq 0.$$\n",
    "\n",
    "Here the variables $\\alpha_i$ ($i=1,\\ldots,n$) are the so-called *Lagrange multipliers* that ensure that constraints are properly enforced. They are chosen just large enough to ensure that $c_i(\\mathbf{x}) \\leq 0$ for all $i$. For instance, for any $\\mathbf{x}$ where $c_i(\\mathbf{x}) < 0$ naturally, we'd end up picking $\\alpha_i = 0$. Moreover, this is a saddle point optimization problem where one wants to *maximize* $L$ with respect to all $\\alpha_i$ and simultaneously *minimize* it with respect to $\\mathbf{x}$. There is a rich body of literature explaining how to arrive at the function $L(\\mathbf{x}, \\alpha_1, \\ldots, \\alpha_n)$. For our purposes it is sufficient to know that the saddle point of $L$ is where the original constrained optimization problem is solved optimally.\n",
    "\n",
    "### Penalties\n",
    "\n",
    "One way of satisfying constrained optimization problems at least *approximately* is to adapt the Lagrangian $L$. \n",
    "Rather than satisfying $c_i(\\mathbf{x}) \\leq 0$ we simply add $\\alpha_i c_i(\\mathbf{x})$ to the objective function $f(x)$. This ensures that the constraints will not be violated too badly.\n",
    "\n",
    "In fact, we have been using this trick all along. Consider weight decay in :numref:`sec_weight_decay`. In it we add $\\frac{\\lambda}{2} \\|\\mathbf{w}\\|^2$ to the objective function to ensure that $\\mathbf{w}$ does not grow too large. From the constrained optimization point of view we can see that this will ensure that $\\|\\mathbf{w}\\|^2 - r^2 \\leq 0$ for some radius $r$. Adjusting the value of $\\lambda$ allows us to vary the size of $\\mathbf{w}$.\n",
    "\n",
    "In general, adding penalties is a good way of ensuring approximate constraint satisfaction. In practice this turns out to be much more robust than exact satisfaction. Furthermore, for nonconvex problems many of the properties that make the exact approach so appealing in the convex case (e.g., optimality) no longer hold.\n",
    "\n",
    "### Projections\n",
    "\n",
    "An alternative strategy for satisfying constraints is projections. Again, we encountered them before, e.g., when dealing with gradient clipping in :numref:`sec_rnn-scratch`. There we ensured that a gradient has length bounded by $\\theta$ via\n",
    "\n",
    "$$\\mathbf{g} \\leftarrow \\mathbf{g} \\cdot \\mathrm{min}(1, \\theta/\\|\\mathbf{g}\\|).$$\n",
    "\n",
    "This turns out to be a *projection* of $\\mathbf{g}$ onto the ball of radius $\\theta$. More generally, a projection on a convex set $\\mathcal{X}$ is defined as\n",
    "\n",
    "$$\\textrm{Proj}_\\mathcal{X}(\\mathbf{x}) = \\mathop{\\mathrm{argmin}}_{\\mathbf{x}' \\in \\mathcal{X}} \\|\\mathbf{x} - \\mathbf{x}'\\|,$$\n",
    "\n",
    "which is the closest point in $\\mathcal{X}$ to $\\mathbf{x}$. \n",
    "\n",
    "![Convex Projections.](../img/projections.svg)\n",
    ":label:`fig_projections`\n",
    "\n",
    "The mathematical definition of projections may sound a bit abstract. :numref:`fig_projections` explains it somewhat more clearly. In it we have two convex sets, a circle and a diamond. \n",
    "Points inside both sets (yellow) remain unchanged during projections. \n",
    "Points outside both sets (black) are projected to \n",
    "the points inside the sets (red) that are closet to the original points (black).\n",
    "While for $\\ell_2$ balls this leaves the direction unchanged, this need not be the case in general, as can be seen in the case of the diamond.\n",
    "\n",
    "\n",
    "One of the uses for convex projections is to compute sparse weight vectors. In this case we project weight vectors onto an $\\ell_1$ ball,\n",
    "which is a generalized version of the diamond case in :numref:`fig_projections`.\n",
    "\n",
    "\n",
    "## Summary\n",
    "\n",
    "In the context of deep learning the main purpose of convex functions is to motivate optimization algorithms and help us understand them in detail. In the following we will see how gradient descent and stochastic gradient descent can be derived accordingly.\n",
    "\n",
    "\n",
    "* Intersections of convex sets are convex. Unions are not.\n",
    "* The expectation of a convex function is no less than the convex function of an expectation (Jensen's inequality).\n",
    "* A twice-differentiable function is convex if and only if its Hessian (a matrix of second derivatives) is positive semidefinite.\n",
    "* Convex constraints can be added via the Lagrangian. In practice we may simply add them with a penalty to the objective function.\n",
    "* Projections map to points in the convex set closest to the original points.\n",
    "\n",
    "## Exercises\n",
    "\n",
    "1. Assume that we want to verify convexity of a set by drawing all lines between points within the set and checking whether the lines are contained.\n",
    "    1. Prove that it is sufficient to check only the points on the boundary.\n",
    "    1. Prove that it is sufficient to check only the vertices of the set.\n",
    "1. Denote by $\\mathcal{B}_p[r] \\stackrel{\\textrm{def}}{=} \\{\\mathbf{x} | \\mathbf{x} \\in \\mathbb{R}^d \\textrm{ and } \\|\\mathbf{x}\\|_p \\leq r\\}$ the ball of radius $r$ using the $p$-norm. Prove that $\\mathcal{B}_p[r]$ is convex for all $p \\geq 1$.\n",
    "1. Given convex functions $f$ and $g$, show that $\\mathrm{max}(f, g)$ is convex, too. Prove that $\\mathrm{min}(f, g)$ is not convex.\n",
    "1. Prove that the normalization of the softmax function is convex. More specifically prove the convexity of\n",
    "    $f(x) = \\log \\sum_i \\exp(x_i)$.\n",
    "1. Prove that linear subspaces, i.e., $\\mathcal{X} = \\{\\mathbf{x} | \\mathbf{W} \\mathbf{x} = \\mathbf{b}\\}$, are convex sets.\n",
    "1. Prove that in the case of linear subspaces with $\\mathbf{b} = \\mathbf{0}$ the projection $\\textrm{Proj}_\\mathcal{X}$ can be written as $\\mathbf{M} \\mathbf{x}$ for some matrix $\\mathbf{M}$.\n",
    "1. Show that for  twice-differentiable convex functions $f$ we can write $f(x + \\epsilon) = f(x) + \\epsilon f'(x) + \\frac{1}{2} \\epsilon^2 f''(x + \\xi)$ for some $\\xi \\in [0, \\epsilon]$.\n",
    "1. Given a convex set $\\mathcal{X}$ and two vectors $\\mathbf{x}$ and $\\mathbf{y}$, prove that projections never increase distances, i.e., $\\|\\mathbf{x} - \\mathbf{y}\\| \\geq \\|\\textrm{Proj}_\\mathcal{X}(\\mathbf{x}) - \\textrm{Proj}_\\mathcal{X}(\\mathbf{y})\\|$.\n",
    "\n",
    "\n",
    "[Discussions](https://discuss.d2l.ai/t/350)\n"
   ]
  }
 ],
 "metadata": {
  "language_info": {
   "name": "python"
  },
  "required_libs": []
 },
 "nbformat": 4,
 "nbformat_minor": 5
}